Monaural Speech Separation

Authors

  • Guoning Hu
  • DeLiang Wang
Abstract

Monaural speech separation has been studied in previous systems that incorporate auditory scene analysis principles. A major problem for these systems is their inability to deal with speech in the high-frequency range. Psychoacoustic evidence suggests that different perceptual mechanisms are involved in handling resolved and unresolved harmonics. Motivated by this, we propose a model for monaural separation that deals with low-frequency and high-frequency signals differently. For resolved harmonics, our model generates segments based on temporal continuity and cross-channel correlation, and groups them according to periodicity. For unresolved harmonics, the model generates segments based on amplitude modulation (AM) in addition to temporal continuity, and groups them according to AM repetition rates derived from sinusoidal modeling. Underlying the separation process is a pitch contour obtained according to psychoacoustic constraints. Our model is systematically evaluated, and it yields substantially better performance than previous systems, especially in the high-frequency range.
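
The cross-channel correlation cue mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frame length, the lag range, and the use of pure sinusoids as stand-ins for adjacent auditory filter outputs are all assumptions made for illustration. The idea shown is only that two channels responding to the same harmonic have nearly identical autocorrelation patterns, so their normalized autocorrelations correlate strongly.

```python
import numpy as np


def normalized_autocorrelation(frame, max_lag):
    """Autocorrelation for lags 0..max_lag-1, scaled to zero mean and unit variance over lags.

    Hypothetical helper; the paper's exact normalization is not given in the abstract.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame) - max_lag  # number of samples summed at every lag
    acf = np.array([np.dot(frame[:n], frame[lag:lag + n]) for lag in range(max_lag)])
    acf = acf - acf.mean()
    std = acf.std()
    return acf / std if std > 0 else acf


def cross_channel_correlation(frame_a, frame_b, max_lag):
    """Mean product of the two normalized autocorrelations (close to 1 when they match)."""
    a = normalized_autocorrelation(frame_a, max_lag)
    b = normalized_autocorrelation(frame_b, max_lag)
    return float(np.mean(a * b))


# Assumed toy setup: 16 kHz sampling, 40 ms frames, lags up to 12.5 ms.
fs = 16000
t = np.arange(640) / fs
channel_1 = np.sin(2 * np.pi * 200 * t)          # responds to a 200 Hz harmonic
channel_2 = np.sin(2 * np.pi * 200 * t + 0.7)    # same harmonic, different phase
channel_3 = np.sin(2 * np.pi * 330 * t)          # dominated by a different component

same = cross_channel_correlation(channel_1, channel_2, max_lag=200)
different = cross_channel_correlation(channel_1, channel_3, max_lag=200)
print(f"same harmonic: {same:.3f}, different component: {different:.3f}")
```

In a full system, the inputs would be the outputs of neighboring filterbank channels for one time frame, and channels whose correlation exceeds some threshold would be merged into a segment. The threshold and the filterbank are not specified in the abstract and are left out here.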

Related resources

Deep Ensemble Learning for Monaural Speech Separation

Monaural speech separation is a fundamental problem in robust speech processing. Recently, deep neural network (DNN) based speech separation methods, which predict either clean speech or an ideal time-frequency mask, have demonstrated remarkable performance improvement. However, a single DNN with a given window length does not leverage contextual information sufficiently, and the differences be...

Phoneme-Dependent NMF for Speech Enhancement in Monaural Mixtures

The problem of separating speech signals out of monaural mixtures (with other non-speech or speech signals) has become increasingly popular in recent times. Among the various solutions proposed, the most popular methods are based on compositional models such as non-negative matrix factorization (NMF) and latent variable models. Although these techniques are highly effective, they largely ignore ...

Monaural Voiced Speech Segregation Based on Pitch and Comb Filter

The correlogram is an important mid-level representation for periodic sounds which is widely used in sound source separation and pitch detection. However, computing it is very time-consuming. In this paper, we present a novel scheme for monaural voiced speech separation without computing correlograms. The noisy speech is first decomposed into time-frequency units. Pitch contour of the target speech ...

Monaural speech separation based on MAXVQ and CASA for robust speech recognition

Robustness is one of the most important topics for automatic speech recognition (ASR) in practical applications. Monaural speech separation based on computational auditory scene analysis (CASA) offers a solution to this problem. In this paper, a novel system is presented to separate the monaural speech of two talkers. Gaussian mixture models (GMMs) and vector quantizers (VQs) are used to learn ...

Monaural speech/music source separation using discrete energy separation algorithm

In this paper, we address the problem of monaural source separation of a mixed signal containing speech and music components. We use the Discrete Energy Separation Algorithm (DESA) to estimate frequency-modulating (FM) signal energy. The FM signal energy is used to design a time-varying filter in the time–frequency domain for rejecting the interfering signal. The FM signal energy was chosen due to ...

Monaural Voiced Speech Separation with Multipitch Tracking

Separating voiced speech from mixtures with interference under monaural conditions is both an important and a challenging task. As multipitch tracking can enable much better speech separation performance for CASA systems, we propose a new multipitch determination algorithm, which can be used under various kinds of noise conditions. In the process of multipitch estimation, a new repre...

Publication date: 2002